A Coarse-Grained Lattice Model for Molecular Recognition

We present a simple model that allows one to investigate equilibrium aspects of molecular recognition
between rigid biomolecules on a generic level. Using a two-stage approach, which consists of a design 
and a testing step, the role of cooperativity and of varying bond strength in molecular recognition 
is investigated. Cooperativity is found to enhance selectivity. In complexes which require a high 
binding flexibility a small number of strong bonds seems to be favored compared to a situation with 
many but weak bonds. 

PACS numbers: 87.15.-v, 87.15.Aa, 89.20.-a 



Living organisms could not function without the ability of biomolecules to specifically recognize
each other [1, 2]. Molecular recognition can be viewed as the ability of a biomolecule to interact
preferentially with a particular target molecule among a vast variety of different but structurally
similar rival molecules. Recognition processes are governed by an interplay of non-covalent
interactions, in particular hydrophobic interactions and hydrogen bonds. Such non-covalent bonds
have typical energies of 1-2 kcal/mole (the relatively strong hydrogen bonds may contribute up to
8-10 kcal/mole) and are therefore only slightly stronger than the thermal energy
$k_B T_{\rm room} \simeq 0.62$ kcal/mole at physiological conditions. Biomolecular recognition is
thus only achieved if a large number of functional groups on the two partner molecules match
precisely. This observation has led to a "key-lock" picture: two biomolecules recognize each other
if their shapes at the recognition site and/or the interactions between the residues in contact
are largely complementary [3].

In the present Letter, we introduce a coarse-grained approach that allows one to investigate this
"principle of complementarity" on a very general level, and use it to study the role of different
factors for the selectivity of interactions between biomolecule surfaces. Specifically, we analyze
two elements that have been discussed in the literature: cooperativity and the interplay of
interaction strengths. We will show that our model can help to understand some of the features of
real protein-protein interfaces.

Previous theoretical studies have mostly dealt with the adsorption of heteropolymers on random and
structured surfaces [4-6]. Some works have adapted the random energy model from the theory of
disordered systems to the problem of biomolecular binding [7, 8]. In contrast, in the present
approach we explicitly consider systems of two interacting, rigid, heterogeneous surfaces. This is
motivated by some basic findings about the biochemical structure of the recognition site, i.e., the
contact interface between recognizing proteins. In recent years the structural properties of
proteins at the recognition site have been clarified [2, 9, 10]. Although different
protein-protein complexes may differ considerably, a general picture of a standard recognition site
containing approximately 30 residues, with a total size of 1200-2000 Å$^2$, has emerged. Apart from
notable exceptions, the association of the proteins is basically rigid, although minor
rearrangements of amino acid side-chains do occur [9, 10].

We describe the structure of the proteins at the contact interfaces by two sets of classical spin
variables $\sigma = (\sigma_1, \ldots, \sigma_N)$ and $\theta = (\theta_1, \ldots, \theta_N)$, whose
values specify the various types of residues. The set $\sigma$ characterizes the structure of the
recognition site on the target molecule, and $\theta$ that on the probe molecule, i.e., the molecule
that is supposed to recognize the target. The position of site $i$ on the surfaces can be specified
arbitrarily. For simplicity, we assume that the positions $i$ on both surfaces match, and that the
total number of contact residues is equal to $N$ for both molecules. However, we take into account
the possibility that the quality of the contact of two residues at position $i$ may vary, e.g., due
to steric hindrances or varying relative alignment of polar moments, caused by minor rearrangements
of the amino acid side-chains. This is modeled by an additional variable $S_i$, $i = 1, \ldots, N$.
The total interaction is thus described by a Hamiltonian $\mathcal{H}(\sigma, \theta; S)$, which
incorporates in a coarse-grained way both the structural properties of the recognition site and the
interaction between residues.

To study the recognition process between two biomolecules, we adopt a two-stage approach. We take
the structure of the target recognition site,
$\sigma^{(0)} = (\sigma^{(0)}_1, \ldots, \sigma^{(0)}_N)$, to be given. In the first step, the probe
"learns" the target structure at a given "design temperature" $1/\beta_D$. One obtains an ensemble
of probe molecules with structures $\theta$ distributed according to a probability
$P(\theta|\sigma^{(0)}) = \frac{1}{Z_D} \sum_S \exp(-\beta_D \mathcal{H}(\sigma^{(0)}, \theta; S))$,
which depends on the target structure; $Z_D$ denotes the normalization. This first design step is
introduced to mimic the design in biotechnological applications or the evolution process in nature.
The parameter $\beta_D$ characterizes the conditions under which the design has been carried out,
i.e., it is a Lagrange parameter which fixes the achieved average interaction energy. A similar
design procedure has been introduced in studies of protein folding [11] and the adsorption of
polymers on structured surfaces [12]. In the second step, the recognition ability of the designed
probe ensemble is tested. To this end the probe molecules are exposed to both the original target
structure $\sigma^{(0)}$ and a competing (different) rival structure $\sigma^{(1)}$ at some
temperature $1/\beta$, which in general differs from the design temperature $1/\beta_D$. The thermal
free energy $F(\theta|\sigma^{(\alpha)})$ for the interaction between $\sigma^{(\alpha)}$
($\alpha = 0, 1$) and a probe $\theta$ is given by

$$F(\theta|\sigma^{(\alpha)}) = -\frac{1}{\beta} \ln \sum_S \exp\left(-\beta \mathcal{H}(\sigma^{(\alpha)}, \theta; S)\right) .$$

Averaged over all probe molecules, we obtain
$\langle F^{(\alpha)} \rangle = \sum_\theta F(\theta|\sigma^{(\alpha)}) P(\theta|\sigma^{(0)})$. The
target is recognized if the average free energy difference
$\Delta F = \langle F^{(0)} \rangle - \langle F^{(1)} \rangle$ is negative, i.e., probe molecules
exposed to equal amounts of target and rival molecules preferentially bind to the target. Note that
our treatment does not account for kinetic effects; only equilibrium aspects are considered.

The association of the proteins is accompanied by a reduction of the translational and rotational
entropy. However, these additional entropic contributions to the free energy of association depend
only weakly on the mass and shape of the rigid molecules, and can be considered, in a first
approximation, to be of the same order for the association with the target and the rival molecule.
Thus, these contributions cancel in the free energy difference. Similarly, contributions from the
interaction with solvent molecules are also assumed to be of comparable size.

A modified HP-model can serve as a first example to illustrate this general description. In the
HP-model, which was introduced originally to study protein folding [13], residues are distinguished
by their hydrophobicity only. Hydrophobic residues are represented by $\sigma_i, \theta_i = +1$, and
polar residues by $\sigma_i, \theta_i = -1$. In addition, the variable $S_i$ describing the
(geometric) quality of the contact can take on the values $\pm 1$, where $S_i = +1$ models a good
contact and $S_i = -1$ a bad one. Only for good contacts does one get a contribution to the binding
energy. The Hamiltonian is then given by

$$\mathcal{H}(\sigma, \theta; S) = -\varepsilon \sum_{i=1}^{N} \frac{1 + S_i}{2}\, \sigma_i \theta_i , \qquad (1)$$

where the sum extends over the $N$ positions of the residues of the recognition site and
$\varepsilon$ is the interaction constant [14]. Note that a "good" contact can nevertheless lead to
an unfavorable energy contribution, namely when a hydrophobic residue faces a polar one
($\sigma_i \theta_i = -1$). For this simple model, the different steps of the two-stage approach
described above can be worked out analytically.
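For very small interfaces the two-stage approach can also be carried out by brute-force enumeration, which gives a direct numerical cross-check of the analytical results. The following is a minimal sketch, assuming the Hamiltonian of Eq. (1); the function names, the value of the interaction constant, and the six-site target and rival structures are illustrative choices and are not taken from the calculations reported here.

```python
import itertools
import math

EPS = 1.0  # interaction constant epsilon (illustrative value)

def hamiltonian(sigma, theta, S, eps=EPS):
    """Eq. (1): only good contacts (S_i = +1) contribute -eps * sigma_i * theta_i."""
    return -eps * sum((1 + s) / 2 * a * t for a, t, s in zip(sigma, theta, S))

def partition_sum(sigma, theta, beta):
    """Sum of exp(-beta * H) over all contact-quality configurations S."""
    n = len(sigma)
    return sum(math.exp(-beta * hamiltonian(sigma, theta, S))
               for S in itertools.product((-1, 1), repeat=n))

def design(target, beta_d):
    """Design step: probabilities P(theta | target) for all 2^N probe structures."""
    probes = list(itertools.product((-1, 1), repeat=len(target)))
    weights = [partition_sum(target, th, beta_d) for th in probes]
    z = sum(weights)
    return probes, [w / z for w in weights]

def free_energy(sigma, theta, beta):
    """Testing step: F(theta | sigma) = -(1/beta) ln sum_S exp(-beta * H)."""
    return -math.log(partition_sum(sigma, theta, beta)) / beta

def delta_f(target, rival, beta_d, beta):
    """Ensemble-averaged free energy difference <F(target)> - <F(rival)>."""
    probes, probs = design(target, beta_d)
    return sum(p * (free_energy(target, th, beta) - free_energy(rival, th, beta))
               for th, p in zip(probes, probs))

target = (1, -1, 1, 1, -1, -1)
rival = (1, 1, -1, 1, -1, 1)   # differs from the target at three of six sites
print(delta_f(target, rival, beta_d=1.0, beta=0.7))
```

A negative return value of `delta_f` means that the designed probe ensemble binds preferentially to the target. The cost grows as $4^N$, so this enumeration is only practical for a handful of sites.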

First, we analyze the efficiency of the design step by inspecting the achieved complementarity (of
interactions) of the designed probe molecules with the target molecule. To this end, we define a
complementarity parameter $K = \sum_i \sigma^{(0)}_i \theta_i$, which ranges from $-N$ to $+N$, with
$K$ close to $+N$ signaling a large complementarity of the recognition sites. The probability
distribution $P(\theta|\sigma^{(0)})$ can be converted to a distribution $P(K)$ for the probability
of having a complementarity $K$. Up to a normalization factor, it is given by

$$P(K) \propto \binom{N}{\frac{1}{2}(N + K)} \exp\left(\frac{\varepsilon \beta_D}{2} K\right) . \qquad (2)$$

Its first moment $\langle K \rangle = \sum_K K P(K) = N \tanh(\varepsilon \beta_D / 2)$ quantifies
the quality of the design. For decreasing design temperatures $1/\beta_D$ the average
complementarity per site $\langle K \rangle / N$ approaches one, and thus the designed probe
molecules are well optimized with respect to the target.
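The distribution of the complementarity follows from carrying out the sum over the contact variables site by site. The short derivation below is a reconstruction from Eq. (1) and the definition of the design ensemble; the shorthand $x_i = \sigma^{(0)}_i \theta_i$ is introduced here only for convenience.

```latex
\begin{align}
\sum_{S_i = \pm 1} \exp\!\Big(\beta_D \varepsilon\, \tfrac{1 + S_i}{2}\, x_i\Big)
  &= 1 + e^{\beta_D \varepsilon x_i}
   = 2\cosh\!\big(\tfrac{\beta_D \varepsilon}{2}\big)\, e^{\beta_D \varepsilon x_i / 2},
   \qquad x_i = \sigma^{(0)}_i \theta_i = \pm 1, \\
P(\theta|\sigma^{(0)}) &\propto \prod_{i=1}^{N} e^{\beta_D \varepsilon x_i / 2}
   = e^{\varepsilon \beta_D K / 2}, \\
P(K) &\propto \binom{N}{\tfrac{1}{2}(N + K)}\, e^{\varepsilon \beta_D K / 2}, \\
\langle K \rangle &= N\, \frac{e^{\varepsilon \beta_D / 2} - e^{-\varepsilon \beta_D / 2}}
                             {e^{\varepsilon \beta_D / 2} + e^{-\varepsilon \beta_D / 2}}
   = N \tanh(\varepsilon \beta_D / 2).
\end{align}
```

The binomial coefficient counts the probe structures with $\frac{1}{2}(N + K)$ matching sites, and the last line uses the fact that the sites are statistically independent in this model.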

In the second step, the association of the probe molecules with the target and with a different
rival molecule is compared. Introducing the quantity $Q = \sum_i \sigma^{(0)}_i \sigma^{(1)}_i$ as a
measure for the similarity between the recognition sites of the target and the rival molecules, the
free energy difference per site can be expressed in the form

$$\Delta F(Q)/N = -\frac{1}{2}\, \varepsilon \tanh\left(\frac{\varepsilon \beta_D}{2}\right) (1 - Q/N) .$$

$\Delta F / N$ is negative if the rival and the target are different, so that $Q$ is smaller than
$N$. The probe molecule therefore binds preferentially to the target molecule, and thus the target
is specifically recognized. The magnitude of the free energy difference increases with decreasing
similarity parameter $Q$.
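The dependence of the free energy difference on the similarity $Q$ can be traced in a few lines. The steps below are a reconstruction based on Eq. (1) and the definitions of $F(\theta|\sigma^{(\alpha)})$ and $\langle F^{(\alpha)} \rangle$ given above, not a quotation of the original calculation.

```latex
\begin{align}
F(\theta|\sigma^{(\alpha)})
  &= -\frac{1}{\beta} \sum_{i} \ln\!\big(1 + e^{\beta \varepsilon \sigma^{(\alpha)}_i \theta_i}\big)
   = -\frac{N}{\beta} \ln\!\big(2\cosh(\beta \varepsilon / 2)\big)
     - \frac{\varepsilon}{2} \sum_{i} \sigma^{(\alpha)}_i \theta_i, \\
\langle \theta_i \rangle &= \sigma^{(0)}_i \tanh(\varepsilon \beta_D / 2), \\
\Delta F &= \langle F^{(0)} \rangle - \langle F^{(1)} \rangle
   = -\frac{\varepsilon}{2} \sum_{i} \big(\sigma^{(0)}_i - \sigma^{(1)}_i\big) \langle \theta_i \rangle
   = -\frac{\varepsilon}{2} \tanh(\varepsilon \beta_D / 2)\, (N - Q).
\end{align}
```

The first line uses $\ln(1 + e^{a x}) = \ln(2\cosh(a/2)) + a x / 2$ for $x = \pm 1$. In this simple model the testing temperature $1/\beta$ drops out of $\Delta F$, which depends only on the design temperature and on $Q$.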

After this introductory analysis of a simple system, we turn to more complex models which allow one
to investigate the influence of different factors on the specific recognition between surfaces. We
begin by studying the role of cooperativity.

Systematic mutagenesis experiments have revealed that cooperativity plays an important role in
molecular recognition processes [15]. Cooperativity in biological processes basically means that
the interaction strength of two residues depends on the interactions in their neighborhood.
Physically, this can be caused by a rearrangement of amino acid side-chains or a readjustment of
polar moments as a function of the local environment. In the simplified language of our model,
cooperativity thus means that the quality of a contact depends on the quality of the neighboring
contacts. This can be incorporated in the HP-model by the following extension:

$$\mathcal{H}(\sigma, \theta; S) = -\varepsilon \sum_{i=1}^{N} \frac{1 + S_i}{2}\, \sigma_i \theta_i - J \sum_{\langle ij \rangle} S_i S_j . \qquad (3)$$

The second sum accounts for the cooperative interaction and runs over neighbor residue positions
$i$ and $j$. The interaction coefficient $J$ is positive for cooperative interactions and negative
for anti-cooperativity. For $J > 0$, the cooperative term rewards additional contacts in the
vicinity of a good contact between two residues. This leads to a better optimization of the
side-chains, and thus the complementarity between the probe and the target molecule is improved.
Cooperativity is therefore expected to enhance the quality of the design step compared to an
interaction without cooperativity. Similarly, one expects an improved recognition specificity.

FIG. 1: Average complementarity per site of the designed probe ensemble for different values of
$J$. The lower dashed curve corresponds to $J = 0$; the upper dashed line represents the limit
$J \to \infty$, which can be tackled analytically [18]. The curves in between, from bottom up,
belong to the values 0.1, 0.25, 0.5, 0.75 of $J$ in units of $\varepsilon$. The inset shows
$\langle K \rangle / N$ for $N = 256$ (full curve) and $N = 36$ (dashed line) with
$J = \varepsilon/2$. Only minor finite-size effects are visible.
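The effect of the cooperative term of Eq. (3) on the design step can be explored directly on very small lattices, where the sums over $\theta$ and $S$ can be carried out exactly. The sketch below is an illustration only: the 2 x 3 lattice with open boundaries, the target structure, and all function names are assumptions made here, and the boundary conditions used for the results shown in the figures are not specified in the text.

```python
import itertools
import math

def neighbor_pairs(rows, cols):
    """Nearest-neighbor pairs <ij> on a rows x cols square lattice (open boundaries)."""
    pairs = []
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                pairs.append((i, i + 1))
            if r + 1 < rows:
                pairs.append((i, i + cols))
    return pairs

def energy(sigma, theta, S, pairs, eps, J):
    """Eq. (3): contact term plus cooperative coupling between neighboring S_i."""
    contact = -eps * sum((1 + s) / 2 * a * t for a, t, s in zip(sigma, theta, S))
    coop = -J * sum(S[i] * S[j] for i, j in pairs)
    return contact + coop

def mean_complementarity(target, pairs, beta_d, eps, J):
    """<K>/N of the designed probe ensemble, summing over all (theta, S)."""
    n = len(target)
    states = list(itertools.product((-1, 1), repeat=n))
    z = 0.0
    k_sum = 0.0
    for theta in states:
        k = sum(a * t for a, t in zip(target, theta))
        for S in states:
            w = math.exp(-beta_d * energy(target, theta, S, pairs, eps, J))
            z += w
            k_sum += k * w
    return k_sum / (z * n)

pairs = neighbor_pairs(2, 3)
target = (1, -1, 1, -1, 1, 1)
for J in (0.0, 0.25, 0.5):
    print(J, mean_complementarity(target, pairs, beta_d=1.0, eps=1.0, J=J))
```

For $J = 0$ the result reduces to $\tanh(\varepsilon \beta_D / 2)$, which provides a simple consistency check. Such a tiny lattice shows strong finite-size effects and is not a substitute for the Monte Carlo calculations at $N = 256$.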



For non-zero but finite values of $J$, the model can no longer be solved analytically. Therefore,
we calculated numerically the density of states for the interaction between two proteins as a
function of the energy and the complementarity parameter using efficient modern Monte Carlo
algorithms [16]. The density of states $\Omega_J(K, E)$ for a fixed target structure $\sigma^{(0)}$
is the number of configurations $(\theta, S)$ that have energy $E = \mathcal{H}$ when interacting
with the target, and a complementarity $K$ with the target recognition site. The probability
distribution of the complementarity $K$ is then (up to a normalization constant) given by
$P_{\beta_D}(K; J) \propto \sum_E \Omega_J(K, E) \exp(-\beta_D E)$.
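The advantage of working with the density of states is that a single determination of $\Omega_J(K, E)$ yields $P_{\beta_D}(K; J)$ at any design temperature by reweighting. The sketch below illustrates this reweighting step only; it replaces the Monte Carlo estimate of Ref. [16] by an exact enumeration on a 2 x 2 lattice with open boundaries, and the names `density_of_states` and `complementarity_distribution` as well as the target structure are hypothetical choices made for this example.

```python
import itertools
import math
from collections import Counter

def density_of_states(target, pairs, eps, J):
    """Count configurations (theta, S) by (K, E) for a fixed target structure."""
    n = len(target)
    states = list(itertools.product((-1, 1), repeat=n))
    omega = Counter()
    for theta in states:
        k = sum(a * t for a, t in zip(target, theta))
        for S in states:
            e = (-eps * sum((1 + s) / 2 * a * t
                            for a, t, s in zip(target, theta, S))
                 - J * sum(S[i] * S[j] for i, j in pairs))
            omega[(k, round(e, 9))] += 1   # rounding merges equal float energies
    return omega

def complementarity_distribution(omega, beta_d):
    """P(K) proportional to sum_E Omega(K, E) exp(-beta_D * E), normalized."""
    weights = Counter()
    for (k, e), count in omega.items():
        weights[k] += count * math.exp(-beta_d * e)
    z = sum(weights.values())
    return {k: w / z for k, w in sorted(weights.items())}

# 2 x 2 lattice with open boundaries: four sites, four nearest-neighbor bonds.
pairs = [(0, 1), (2, 3), (0, 2), (1, 3)]
target = (1, 1, -1, 1)
omega = density_of_states(target, pairs, eps=1.0, J=0.5)
for beta_d in (0.5, 1.0, 2.0):   # one enumeration, several design temperatures
    print(beta_d, complementarity_distribution(omega, beta_d))
```

For $J = 0$ the reweighted distribution reproduces the binomial form of Eq. (2), which can serve as a test of any density-of-states estimate before turning on the cooperative coupling.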

For simplicity, we consider asymptotically large interfaces on a square lattice. (The actual
calculations shown here were carried out with $N = 256$, and we checked that the results no longer
change for larger $N$.) Fig. 1 shows the average complementarity $\langle K \rangle / N$ for
different cooperativities $J$. Cooperativity is found to increase the average complementarity of
the designed probe molecules for large enough values of the parameter $\varepsilon \beta_D$. For
$\varepsilon \beta_D \sim 1$, a small change in the cooperativity $J$ leads to a large difference in
the average complementarity, i.e., small changes in $J$ can have a large impact on the recognition
process. As $\varepsilon$ is typically of the order of 1 kcal/mole, this regime indeed corresponds
to physiological conditions for reasonable design temperatures, $1/\beta_D \sim k_B T_{\rm room}$.
Fig. 2 shows the free energy difference per site $\Delta F(Q)/N$ of the association of probe
molecules with the target structure and a rival structure, for different values of the
cooperativity constant $J$. Increasing the cooperativity increases the magnitude of the free energy
difference. Relatively small cooperativities are sufficient to obtain an effect, and the maximum
effect of cooperativity is already reached for a value $J \sim \varepsilon$. Thus, we find that
cooperativity indeed improves the recognition ability as expected for cooperativity constants
$J \sim \varepsilon$. The above findings were obtained for large interfaces. Although minor
finite-size effects are visible for interfaces of realistic size (with $N \sim \mathcal{O}(30)$),
the general findings discussed above still hold qualitatively (compare the inset of Fig. 1).

FIG. 2: Free energy difference per site (in arbitrary units) of the association of the probe
ensemble with the two competing molecules as a function of their similarity for different
cooperativities $J$ (with $\beta_D = \beta = 0.5$). The upper dashed line corresponds to $J = 0$;
the lower dashed line describes the limiting case $J \to \infty$ for $Q/N$ close to one [18]. The
full curves from top to bottom correspond to the same values of $J$ as in Fig. 1.

In situations where one molecule is flexible, conformational changes occur. However, cooperativity
works on the level of residue interactions, and thus we expect that the favorable effect of
cooperativity on molecular recognition is not spoilt by the entropic contributions due to
refolding. This, however, needs further investigation. Note that flexible binding has been
addressed recently [17].

Next, we investigate the role of the interplay of interactions for molecular recognition. This
study is motivated by the observation that antibody-antigen interfaces have very specific
properties. Mutagenesis studies have revealed that the structural interface in these complexes is
different from the functional recognition site made up of those residues that contribute to the
binding energy.
Only approximately one quarter of the residues at the 
interface contribute considerably to the binding energy 
0,0] . These contributing residues are sometimes called 
"hot spots". In addition it has been shown that antigen- 
antibody interfaces are less hydrophobic compared to other protein-protein interfaces, so that the
relatively strong hydrogen bonds are more important. In the immune system, molecular recognition
must satisfy very specific requirements. The immune system has to recognize substances that have
never been encountered before. Thus antigen-antibody recognition has to exhibit a large flexibility
and has to be able to adapt very rapidly by evolution. These peculiarities of antibody-antigen
interfaces suggest that selective molecular interactions are obtained most efficiently with only a
few strong interactions across the interface, so that a complementarity with the whole recognition
site is not necessary.

FIG. 3: Averaged free energy difference per site (in arbitrary units) as a function of the
fraction $A/N$ of active residues for $\varepsilon_H A/N = 0.1$. The full curve corresponds to a
ratio $\beta/\beta_D = 1$, the dashed curve to $\beta/\beta_D = 1/2$.



Within our two-stage approach, we can address the question whether few but strong bonds or many but
weak bonds are more favorable. To this end we consider a model which distinguishes between active
and inactive residues only. Only active residues contribute to a bond. The variables $\sigma$ and
$\theta$ now take on the values $\sigma_i, \theta_i = +1$ for active and $\sigma_i, \theta_i = 0$
for inactive residues, and the Hamiltonian is given by

$$\mathcal{H}(\sigma, \theta; S) = -\varepsilon_H \sum_{i=1}^{N} \frac{1 + S_i}{2}\, \sigma_i \theta_i , \qquad (4)$$

with $S_i$ specifying again the quality of the contact of residues, and $\varepsilon_H$ giving the
interaction strength. Moreover, we extend the design step by fixing the average number of active
residues $A = \langle \sum_i \theta_i \rangle$ on the probe molecules with a Lagrange parameter.
The total interaction energy $E$ is also subject to restrictions: it has to exceed the thermal
energy to stabilize the complex, but on the other hand it has to be small enough to ensure the high
flexibility of the target-probe complex that is crucial for the immune system. When increasing the
average number of active residues $A$, one must therefore reduce the interaction energy
$\varepsilon_H$ accordingly, e.g., by keeping the product $E \sim A \varepsilon_H$ constant.
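How a Lagrange parameter can fix the average number of active residues is illustrated by the following sketch. It rests on an assumption that is not spelled out in the text, namely that the design ensemble takes the form $P(\theta|\sigma^{(0)}) \propto \exp(\mu \sum_i \theta_i) \sum_S \exp(-\beta_D \mathcal{H})$ with a Lagrange parameter $\mu$; under this assumption the sites decouple and $\mu$ can be found by bisection. The names `active_probability` and `solve_mu`, the eight-site target, and the numerical values are purely illustrative.

```python
import math

def active_probability(sigma_i, mu, beta_d, eps_h):
    """P(theta_i = 1) at one site after summing over S_i in Eq. (4).

    theta_i = 0 contributes weight 2 (both S_i values, zero energy);
    theta_i = 1 contributes exp(mu) * (1 + exp(beta_d * eps_h * sigma_i)).
    The factor exp(mu) is the assumed form of the Lagrange term.
    """
    w_active = math.exp(mu) * (1.0 + math.exp(beta_d * eps_h * sigma_i))
    return w_active / (2.0 + w_active)

def solve_mu(target, A, beta_d, eps_h, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection for the Lagrange parameter mu such that sum_i P(theta_i = 1) = A."""
    def mean_active(mu):
        return sum(active_probability(s, mu, beta_d, eps_h) for s in target)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_active(mid) < A:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

target = (1, 0, 1, 1, 0, 0, 1, 0)      # 1 = active, 0 = inactive
N = len(target)
E_TOTAL = 0.1 * N                      # keeps A * eps_h fixed (illustrative value)
for A in (1.0, 2.0, 4.0):
    eps_h = E_TOTAL / A                # fewer active residues -> stronger bonds
    mu = solve_mu(target, A, beta_d=1.0, eps_h=eps_h)
    p = [active_probability(s, mu, 1.0, eps_h) for s in target]
    print(A, eps_h, round(sum(p), 6))
```

The bisection works because the mean number of active residues increases monotonically with $\mu$. The sketch covers only the constrained design step for one target; the averaging over target and rival structures is not attempted here.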

Figure 3 shows, as a function of $A/N$, the average free energy difference per site $\Delta F/N$ of
the association with the target molecule and a rival molecule, averaged over all possible target
and rival structures $\sigma$. We find that $\langle \Delta F \rangle$ exhibits a minimum at a
small fraction $A/N$ of active residues. The position of the minimum is fairly insensitive to a
variation of the interaction parameters. Hence this simple coarse-grained model already predicts
that molecular recognition is most efficient if the functional recognition site consists only of a
small fraction of the structural recognition site, as is indeed observed in antibody-antigen
complexes.

In conclusion, we have presented coarse-grained models that allow one to study generic features of
biomolecular recognition. A two-stage approach which distinguishes between the design of probe
molecules and the test of their recognition abilities has been adopted. We have applied the
approach to investigate the role of cooperativity and of hydrogen bonding for molecular
recognition. It turned out that cooperativity can substantially influence the efficiency of both
the design and the recognition ability of recognition sites. Our model also reproduces the
observation that the structural recognition site has to be distinguished from a functional
recognition site in highly flexible complexes such as antigen-antibody complexes.

The approach can readily be generalized to study other aspects of molecular recognition. For
example, it will be interesting to investigate the influence of the heterogeneity of the mixture of
target and rival molecules in physiological situations. This can be incorporated by considering
ensembles of targets and rivals differing in certain properties, for example correlations and
length scales. A recent study indeed showed that the local small-scale structure of molecules seems
to be important for molecular recognition [6].

Financial support of the Deutsche Forschungsgemeinschaft (SFB 613) is gratefully acknowledged.



[1] B. Alberts et al., Molecular Biology of the Cell, Garland Publishing, Inc., New York, 1994.
[2] C. Kleanthous, ed., Protein-Protein Recognition, Oxford University Press, Oxford, 2000.
[3] L. Pauling, M. Delbrück, Science 92, 77 (1940).
[4] A. K. Chakraborty, Phys. Rep. 342, 1 (2001).
[5] A. Polotsky, A. Degenhard, F. Schmid, J. Chem. Phys. 120, 6246 (2004); 121, 4853 (2004).
[6] T. Bogner, A. Degenhard, F. Schmid, Phys. Rev. Lett. 93, 268108 (2004).
[7] J. Janin, Proteins 25, 438 (1996).
[8] J. Wang, G. M. Verkhivker, Phys. Rev. Lett. 90, 188101 (2003).
[9] S. Jones, J. M. Thornton, Proc. Natl. Acad. Sci. USA 93, 13 (1996).
[10] L. Lo Conte, C. Chothia, J. Janin, J. Mol. Biol. 285, 2177 (1999).
[11] V. S. Pande, A. Yu. Grosberg, T. Tanaka, Rev. Mod. Phys. 72, 259 (2000).
[12] A. Jayaraman, C. K. Hall, J. Genzer, Phys. Rev. Lett. 94, 078103 (2005).
[13] K. A. Dill, Biochemistry 24, 1501 (1985).
[14] Note that the original HP-model does not contain an additional variable S to model the quality of contacts.
[15] E. di Cera, Chem. Rev. 98, 1563 (1998).
[16] A. Hüller, M. Pleimling, Int. J. Mod. Phys. C 13, 947 (2002); F. Wang, D. P. Landau, Phys. Rev. Lett. 86, 2050 (2001).
[17] J. Wang, Q. Lu, H. P. Lu, PLoS Comput. Biol. 2, e78 (2006).
[18] H. Behringer, A. Degenhard, F. Schmid, unpublished.
[19] B. C. Cunningham, J. A. Wells, J. Mol. Biol. 234, 554 (1993).



